Method of prioritizing transmission of spectral components of audio signals

ABSTRACT

A method for the transmission of audio signals between a transmitter and at least one receiver operates according to the prioritizing pixel transmission method. The audio signal is first broken down into a number of spectral fractions. The broken-down audio signal is stored in a two-dimensional array with a plurality of fields. The dimensions to be registered in the field are frequency and time; the value to be registered in the field is amplitude. Groups are then formed from the individual fields and a priority is assigned to the individual groups, in which the priority will be gauged as higher if the amplitudes of the group values are higher, and/or if the amplitude differences of the values of one group are higher, and/or if the groups are closer to actual time. Finally, the groups are transmitted to the receiver according to the order of their established priority.

The invention relates to a method of prioritizing transmission of spectral components of audio signals.

Currently a multiplicity of methods exists for the compressed transmission of audio signals. Essentially the following methods are among them:

-   Reduction of the sampling rate, for example 3 kHz instead of 44 kHz
-   Nonlinear transmission of the sampled values, for example in ISDN transmission
-   Utilization of previously stored acoustic sequences, for example MIDI or voice simulation
-   Employing Markov models for the correction of transmission errors.

The known methods have in common that satisfactory voice intelligibility is still provided even at lower transmission rates. This is substantially attained through the formation of mean values. However, different voices at the source yield similar-sounding voices at the sink (the receiving end), such that, for example, voice fluctuations which are detectable in normal conversation are no longer transmitted. This results in a marked restriction in the quality of communication.

Methods for compressing and decompressing image or video data by means of prioritized pixel transmission are described in the applications DE 101 13 880.6 (corresponding to PCT/DE02/00987), now issued as U.S. Pat. No. 7,130,347, and DE 101 52 612.1 (corresponding to PCT/DE02/00995), now issued as U.S. Pat. No. 7,359,560. In these methods, for example, digital image or video data are processed which are comprised of an array of individual pixels, each pixel comprising a pixel value varying in time, which describes color or brightness information of the pixel. According to the invention, to each pixel or each pixel group a priority is assigned and the pixels are stored corresponding to their prioritization in a priority array. This array contains at each point in time the pixel values sorted according to prioritization. These pixels and the pixel values utilized for the calculation of the prioritization are transmitted or stored corresponding to the prioritization. A pixel receives a high priority if the differences to its adjacent pixels are very large. For the reconstruction the particular current pixel values are represented on the display. The pixels not yet transmitted are calculated from the already transmitted pixels. These methods can in principle also be utilized for the transmission of audio signals.

The invention therefore has as its aim to specify a method for transmitting audio signals which operates with minimum losses even at low transmission bandwidths.

According to the invention the audio signal is first resolved into a number n of spectral components. The resolved audio signal is stored in a two-dimensional array with a multiplicity of fields, with frequency and time as the dimensions and the amplitude as the particular value to be entered in the field. Subsequently from each individual field and at least two fields adjacent to this field of the array, groups are formed, and to the individual groups a priority is assigned, the priority of a group being selected higher the greater the amplitudes of the group values are and/or the greater the amplitude differences of the values of a group are and/or the closer the group is to the current time. Lastly, the groups are transmitted to the receiver in the sequence of their priority.

The new method essentially rests on the foundations laid by Shannon. According to them, signals can be transmitted free of loss if they are sampled at a rate of at least twice their highest frequency. This means that the sound can be resolved into individual sinusoidal oscillations of different amplitude and frequency. Accordingly, the acoustic signals can be unambiguously restored without losses by transmitting the individual frequency components, including amplitudes and phases. In particular, use is made of the fact that frequently occurring sound sources, for example musical instruments or the human voice, are comprised of resonance bodies whose resonant frequency does not change at all or changes only slowly.

Advantageous embodiments and further developments of the invention are specified in the dependent patent claims.

An embodiment example of the invention will be described in the following. Reference shall be made in particular also to the specification and the drawing of the earlier patent applications DE 101 13 880.6 and DE 101 52 612.1. The two aforementioned applications have been issued as U.S. Pat. Nos. 7,130,347 and 7,359,560, respectively, and these U.S. Patents are incorporated by reference as if fully set forth herein.

First, the sound is picked up, converted into electric signals and resolved into its frequency components. This can be carried out either through FFT (Fast Fourier Transformation) or through n discrete frequency-selective filters. If n discrete filters are utilized, each filter picks up only a single frequency or a narrow frequency band (similar to the cilia in the human ear). Consequently, there is at each point in time the frequency and the amplitude value at this frequency. The number n can assume different values according to the end device properties. The greater n is, the better the audio signal can be reproduced. n is consequently a parameter with which the quality of the audio transmission can be scaled.
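The resolution of one short signal frame into n frequency components can be sketched as follows. This is only a minimal illustration under stated assumptions: a naive discrete Fourier transform stands in for the FFT or the filter bank named in the text, and the function name `dft_components`, the frame length of 64 samples and the test tone are hypothetical choices made for the example.

```python
import cmath
import math

def dft_components(frame, n):
    """Return (amplitude, phase) for the first n frequency bins of a frame.

    Assumes n <= len(frame) // 2. A naive DFT is used for clarity; it is a
    stand-in for the FFT or the n frequency-selective filters of the text.
    """
    size = len(frame)
    components = []
    for k in range(n):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / size)
                  for t in range(size))
        # Scale so that a sinusoid of amplitude A yields the value A
        # (the constant bin k = 0 has no mirrored partner, hence 1/size).
        scale = 1.0 / size if k == 0 else 2.0 / size
        components.append((abs(acc) * scale, cmath.phase(acc)))
    return components

# Hypothetical test frame: a sinusoid of amplitude 0.5 that falls in bin 4.
frame = [0.5 * math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
spectrum = dft_components(frame, n=8)
strongest_bin = max(range(8), key=lambda k: spectrum[k][0])
```

Raising n in this sketch yields a finer frequency resolution at higher computational cost, which corresponds to the role of n as a quality parameter described above.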

The amplitude values are placed into intermediate storage in the fields of a two-dimensional array.

The first dimension of the array corresponds to the time axis and the second dimension to the frequency. Therewith every sampled value with the particular amplitude value and phase is unambiguously determined and can be stored in the associated field of the array as a complex number. The voice signal is consequently represented in three acoustic dimensions (parameters) in the array: the time, for example in milliseconds (ms), perceptually discerned as duration, as the first dimension of the array; the frequency in Hertz (Hz), perceptually discerned as tone pitch, as the second dimension of the array; and the energy (or intensity) of the signal, perceptually discerned as volume or intensity, which is stored as a numerical value in the corresponding field of the array.
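One possible in-memory layout of such an array is sketched below. It assumes that amplitude and phase of a field are packed into a single complex value; the helper names `make_array`, `store` and `load` and the array size are hypothetical and serve only as illustration.

```python
import cmath

def make_array(num_time_slots, num_frequencies):
    """Create an empty time x frequency array; None marks an unfilled field."""
    return [[None] * num_frequencies for _ in range(num_time_slots)]

def store(array, time_index, freq_index, amplitude, phase):
    """Store amplitude and phase of one field as a single complex value."""
    array[time_index][freq_index] = cmath.rect(amplitude, phase)

def load(array, time_index, freq_index):
    """Return (amplitude, phase) recovered from the stored complex value."""
    return cmath.polar(array[time_index][freq_index])

# Hypothetical example: 4 time slots, 8 frequencies, one field filled.
array = make_array(num_time_slots=4, num_frequencies=8)
store(array, 2, 5, amplitude=0.8, phase=0.5)
amplitude, phase = load(array, 2, 5)
```

The first index selects the time slot and the second the frequency, matching the order of dimensions given in the text.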

In comparison to the applications DE 101 13 880.6 and DE 101 52 612.1, the frequency corresponds for example to the image height, the time to the image width and the amplitude of the audio signal (intensity) to the color value.

Similar to the method of the prioritizing of pixel groups in image/video coding, groups are formed of adjacent values and these are prioritized. Each field, considered by itself, together with at least one, preferably however several, adjacent fields forms one group. The groups are comprised of the position value, defined by time and frequency, the amplitude value at the position value, and the amplitude values of the allocated values corresponding to a previously defined form (see FIG. 2 of applications DE 101 13 880.6 and DE 101 52 612.1). Especially those groups receive a very high priority which are close to the current time and/or whose amplitude values, in comparison to the other groups, are very large and/or in which the amplitude values within the group differ strongly. The groups are sorted in descending order of priority and stored or transmitted in this sequence.
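The group formation and prioritization described above can be illustrated by the following sketch. Several points are assumptions made for the example and are not taken from the specification: the L-shaped group of three fields, the additive priority formula, the weights `w_amp`, `w_diff` and `w_time`, and all function names.

```python
def form_groups(array):
    """Form one group per field from the field and two adjacent fields.

    The group shape (field, next time slot, next frequency) is an assumed
    form; the text refers to FIG. 2 of the earlier applications instead.
    """
    groups = []
    for t in range(len(array) - 1):
        for f in range(len(array[t]) - 1):
            values = (array[t][f], array[t + 1][f], array[t][f + 1])
            groups.append({"time": t, "freq": f, "values": values})
    return groups

def priority(group, current_time, w_amp=1.0, w_diff=1.0, w_time=1.0):
    """Illustrative priority: large amplitudes, large differences within the
    group and closeness to the current time all raise the value."""
    values = group["values"]
    amplitude_term = max(values)
    difference_term = max(values) - min(values)
    recency_term = 1.0 / (1.0 + current_time - group["time"])
    return (w_amp * amplitude_term + w_diff * difference_term
            + w_time * recency_term)

def prioritize(array, current_time):
    """Return all groups sorted by descending priority."""
    groups = form_groups(array)
    return sorted(groups, key=lambda g: priority(g, current_time),
                  reverse=True)

# Rows are time slots, columns are frequencies, entries are amplitudes.
amplitudes = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.2, 0.9, 0.0],
    [0.0, 0.1, 0.0, 0.0],
]
ordered = prioritize(amplitudes, current_time=3)
first = ordered[0]
```

In this hypothetical data set the group anchored at the isolated peak (time slot 2, frequency 2) comes first, because it combines a large amplitude, a large difference within the group and a recent position in time. A transmitter would then send the entries of `ordered` from front to back.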

The width of the array (time axis) preferably has only a limited extent (for example 5 seconds), i.e. only signal sections of, for example, 5 seconds in length are always processed. After this time (for example 5 seconds) the array is filled with the values of the succeeding signal sections.
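The sectioning of the signal can be sketched as below. The function name `split_into_sections`, the sample rate of 8000 Hz and the signal length of 12 seconds are hypothetical; only the section length of 5 seconds is taken from the example in the text.

```python
def split_into_sections(samples, sample_rate, section_seconds=5.0):
    """Yield consecutive signal sections of at most section_seconds length."""
    section_length = int(sample_rate * section_seconds)
    for start in range(0, len(samples), section_length):
        yield samples[start:start + section_length]

# Hypothetical input: 12 seconds of silence sampled at 8000 Hz.
sample_rate = 8000
samples = [0.0] * (12 * sample_rate)
sections = list(split_into_sections(samples, sample_rate))
section_lengths = [len(section) for section in sections]
```

Each yielded section would fill the array once; the last section is simply shorter when the signal length is not a multiple of the section length.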

The values of the individual groups are received in the receiver according to the above described prioritization parameters (amplitude, closeness of position in time and amplitude differences from adjacent values).

In the receiver the groups are again entered into a corresponding array. According to patent applications DE 101 13 880.6 and DE 101 52 612.1, subsequently from the transmitted groups the three-dimensional spectral representation can again be generated. The more groups were received, the more precise is the reconstruction. The not yet transmitted array values are calculated by means of interpolation from the already transmitted array values. From the thus generated array, subsequently in the receiver a corresponding audio signal is generated which subsequently can be converted into sound.
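The receiver-side interpolation can be sketched as follows. The specification does not fix the interpolation scheme; linear interpolation along the time axis, the copying of the nearest value at the edges and the function name `interpolate_missing` are assumptions chosen for this illustration.

```python
def interpolate_missing(array):
    """Fill fields that are still None by interpolating along the time axis.

    array[t][f] holds an amplitude, or None if the field was not yet received.
    Fields before the first or after the last received value copy the nearest
    received value; a frequency without any received value is set to 0.0.
    """
    num_times = len(array)
    num_freqs = len(array[0])
    filled = [row[:] for row in array]
    for f in range(num_freqs):
        known = [t for t in range(num_times) if array[t][f] is not None]
        if not known:
            for t in range(num_times):
                filled[t][f] = 0.0
            continue
        for t in range(num_times):
            if array[t][f] is not None:
                continue
            earlier = [k for k in known if k < t]
            later = [k for k in known if k > t]
            if earlier and later:
                t0, t1 = earlier[-1], later[0]
                weight = (t - t0) / (t1 - t0)
                filled[t][f] = ((1 - weight) * array[t0][f]
                                + weight * array[t1][f])
            else:
                nearest = earlier[-1] if earlier else later[0]
                filled[t][f] = array[nearest][f]
    return filled

# Hypothetical partially received array: 4 time slots, 2 frequencies.
received = [
    [0.0, None],
    [None, 0.4],
    [None, None],
    [0.6, None],
]
reconstructed = interpolate_missing(received)
```

As further groups arrive, their fields replace the interpolated estimates, so the reconstruction becomes more precise with each received group.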

For the synthesis of the audio signal, for example, n frequency generators can be utilized, whose signals are added to an output signal. Through this parallel structuring of n generators good scalability is attained. In addition, the clock rate can be drastically reduced through parallel processing, such that, due to a lower energy consumption, the playback time in mobile end devices is increased. For parallel application, for example, FPGAs or ASICs of simple design can be employed.
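The summation of n frequency generators can be sketched in software as follows. This is a sequential illustration only, whereas the text envisages parallel hardware; the function name `synthesize`, the sample rate and the two example components are hypothetical.

```python
import math

def synthesize(components, sample_rate, num_samples):
    """Add the outputs of one sine generator per (frequency, amplitude, phase)."""
    output = [0.0] * num_samples
    for frequency, amplitude, phase in components:
        for i in range(num_samples):
            output[i] += amplitude * math.sin(
                2 * math.pi * frequency * i / sample_rate + phase)
    return output

# Hypothetical components: 440 Hz and 880 Hz with different amplitudes.
components = [(440.0, 0.5, 0.0), (880.0, 0.25, math.pi / 2)]
signal = synthesize(components, sample_rate=8000, num_samples=80)
```

Because each generator contributes independently to the sum, the outer loop over the components is the part that could be distributed across parallel units.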

The described method is not limited to audio signals. The method can be effectively applied in particular where several sensors (sound sensors, light sensors, tactile sensors, etc.) are utilized, which continuously measure signals which subsequently can be represented in an array (of nth order).

The advantages compared to previous systems reside in the flexible applicability in the case of increased compression rates. By utilizing an array which is supplied from different sources, the synchronization of the sources is automatically obtained. The corresponding synchronization in conventional methods must be ensured through special protocols or measures. In particular in video transmission with long propagation times, for example satellite connections, where sound and image are transmitted across different channels, a lack of synchronization of the lips with the voice is frequently noticeable. This can be eliminated through the described method.

Since the same fundamental principle of the prioritizing pixel group transmission can be utilized in voice, image and video transmission, a strong synergy effect is utilizable in the implementation. In addition, simple synchronization between speech and images can take place in this way. Furthermore, arbitrary scaling between image and audio resolution would be possible.

If an individual audio transmission according to the new method is considered, in terms of voice a more natural reproduction results, since the frequency components (groups) typical for each human being are transmitted with highest priority and therewith free of loss.

1. Method of transmitting audio signals between a transmitter and at least one receiver, comprising the steps of: (a) resolving an audio signal into a number n of spectral components through a number n of frequency selective filters; (b) storing the resolved audio signals in a two-dimensional array having a multiplicity of fields, and wherein frequency and time are stored as dimensions of the array and the amplitude as a particular value to be entered in a field within the multiplicity of fields of the array; (c) combining each field of the multiplicity of fields into a field group wherein there are a plurality of field groups formed from the multiplicity of fields, and each field group is formed from at least three adjacent fields; (d) assigning a priority to each group of the plurality of field groups, the priority of one group over another group becoming greater based upon at least the function of the greater the amplitude differences of the values of a group; (e) sorting the field groups of said array with the aid of their priority value; (f) storing and/or transmitting the groups to the at least one receiver in the sequence of their priority; and (g) transmitting said audio signals at low transmission bandwidths so as to minimize transmission losses.
2. Method as claimed in claim 1, characterized in that the entire audio signal exists as an audio file and is processed and transmitted in its entirety.
3. Method as claimed in claim 1, characterized in that only a portion of the audio signal is processed and transmitted in each instance.
4. Method as claimed in claim 1, characterized in that the audio signal is resolved into its spectral components by means of FFT.
5. Method as claimed in claim 1, characterized in that in the receiver the groups transmitted in accordance with their priority are assigned to a corresponding array, the values of the array still to be transmitted being calculated through interpolation from the already available values.
6. Method as claimed in claim 1, characterized in that from the existing and calculated values in the receiver an electric signal is generated and converted into an audio signal.
7. The method of claim 1, wherein said assigning step further comprises assigning a priority to each group of the plurality of field groups, the priority of one group over another group becoming greater based upon utilization of one or more functions selected from the group comprising: (i) the greater the amplitudes of the group's values; and/or (ii) the closer the group is to the current time.
8. The method of claim 1, wherein said transmission step further comprises transmitting individual frequency components, wherein said individual frequency components further comprise amplitudes and phases.
9. Method of transmitting signals between a transmitter and at least one receiver, comprising the steps of: (a) resolving a signal into a number n of spectral components through a number n of frequency selective filters; (b) storing the resolved signals in a two-dimensional array having a multiplicity of fields, and wherein frequency and time are stored as dimensions of the array and the amplitude as a particular value to be entered in a field within the multiplicity of fields of the array; (c) combining each field of the multiplicity of fields into a field group wherein there are a plurality of field groups formed from the multiplicity of fields, and each field group is formed from at least three adjacent fields; (d) assigning a priority to each group of the plurality of field groups, the priority of one group over another group becoming greater based upon at least the function of the greater the amplitude differences of the values of a group; (e) sorting the field groups of said array with the aid of their priority value; (f) storing and/or transmitting the groups to the at least one receiver in the sequence of their priority; and (g) transmitting said signals at low transmission bandwidths so as to minimize transmission losses.